{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/nebius.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Nebius Embeddings\n",
    "\n",
    "This notebook demonstrates how to use [Nebius AI Studio](https://studio.nebius.ai/) Embeddings with LlamaIndex. Nebius AI Studio implements all state-of-the-art embeddings models, available for commercial use."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, let's install LlamaIndex and dependencies of Nebius AI Studio."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-embeddings-nebius llama-index"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Upload your Nebius AI Studio key from system variables below or simply insert it. You can get it by registering for free at [Nebius AI Studio](https://auth.eu.nebius.com/ui/login) and issuing the key at [API Keys section](https://studio.nebius.ai/settings/api-keys)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "NEBIUS_API_KEY = os.getenv(\"NEBIUS_API_KEY\")  # NEBIUS_API_KEY = \"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's get embeddings using Nebius AI Studio"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.embeddings.nebius import NebiusEmbedding\n",
    "\n",
    "embed_model = NebiusEmbedding(api_key=NEBIUS_API_KEY)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Basic usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4096\n",
      "[-0.0024051666259765625, 0.0083770751953125, -0.005413055419921875, 0.007396697998046875, -0.022247314453125]\n"
     ]
    }
   ],
   "source": [
    "text = \"Everyone loves justice at another person's expense\"\n",
    "embeddings = embed_model.get_text_embedding(text)\n",
    "assert len(embeddings) == 4096\n",
    "print(len(embeddings), embeddings[:5], sep=\"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Asynchronous usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4096\n",
      "[-0.0024051666259765625, 0.0083770751953125, -0.005413055419921875, 0.007396697998046875, -0.022247314453125]\n"
     ]
    }
   ],
   "source": [
    "text = \"Everyone loves justice at another person's expense\"\n",
    "embeddings = await embed_model.aget_text_embedding(text)\n",
    "assert len(embeddings) == 4096\n",
    "print(len(embeddings), embeddings[:5], sep=\"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Batched usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.0003848075866699219, 0.0004799365997314453, 0.011199951171875]\n",
      "[-0.0037078857421875, 0.0114288330078125, 0.00878143310546875]\n",
      "[0.005924224853515625, 0.005153656005859375, 0.001438140869140625]\n",
      "[-0.009490966796875, -0.004852294921875, 0.004779815673828125]\n"
     ]
    }
   ],
   "source": [
    "texts = [\n",
    "    \"As the hours pass\",\n",
    "    \"I will let you know\",\n",
    "    \"That I need to ask\",\n",
    "    \"Before I'm alone\",\n",
    "]\n",
    "\n",
    "embeddings = embed_model.get_text_embedding_batch(texts)\n",
    "assert len(embeddings) == 4\n",
    "assert len(embeddings[0]) == 4096\n",
    "print(*[x[:3] for x in embeddings], sep=\"\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Async batched usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[-0.0003848075866699219, 0.0004799365997314453, 0.011199951171875]\n",
      "[-0.0037078857421875, 0.0114288330078125, 0.00878143310546875]\n",
      "[0.005924224853515625, 0.005153656005859375, 0.001438140869140625]\n",
      "[-0.009490966796875, -0.004852294921875, 0.004779815673828125]\n"
     ]
    }
   ],
   "source": [
    "texts = [\n",
    "    \"As the hours pass\",\n",
    "    \"I will let you know\",\n",
    "    \"That I need to ask\",\n",
    "    \"Before I'm alone\",\n",
    "]\n",
    "\n",
    "embeddings = await embed_model.aget_text_embedding_batch(texts)\n",
    "assert len(embeddings) == 4\n",
    "assert len(embeddings[0]) == 4096\n",
    "print(*[x[:3] for x in embeddings], sep=\"\\n\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
